Goto

Collaborating Authors

 strong detection


The Algorithmic Phase Transition in Symmetric Correlated Spiked Wigner Model

Li, Zhangsong

arXiv.org Machine Learning

We study the computational task of detecting and estimating correlated signals in a pair of spiked Wigner matrices. Our model consists of observations $$ X = \tfracλ{\sqrt{n}} xx^{\top} + W \,, \quad Y = \tfracμ{\sqrt{n}} yy^{\top} + Z \,. $$ where $x,y \in \mathbb R^n$ are signal vectors with norm $\|x\|,\|y\| \approx\sqrt{n}$ and correlation $\langle x,y \rangle \approx ρ\|x\|\|y\|$, while $W,Z$ are independent Gaussian Wigner matrices. We propose an efficient algorithm that succeeds whenever $F(λ,μ,ρ)>1$, where $$ F(λ,μ,ρ)=\max\Big\{ λ,μ, \frac{ λ^2 ρ^2 }{ 1-λ^2+λ^2 ρ^2 } + \frac{ μ^2 ρ^2 }{ 1-μ^2+μ^2 ρ^2 } \Big\} \,. $$ Our result shows that an algorithm can leverage the correlation between the spikes to detect and estimate the signals even in regimes where efficiently recovering either $x$ from $X$ alone or $y$ from $Y$ alone is believed to be computationally infeasible. We complement our algorithmic result with evidence for a matching computational lower bound. In particular, we prove that when $F(λ,μ,ρ)<1$, all algorithms based on {\em low-degree polynomials} fails to distinguish $(X,Y)$ with two independent Wigner matrices. This low-degree analysis strongly suggests that $F(λ,μ,ρ)=1$ is the precise computation threshold for this problem.


Detecting Arbitrary Planted Subgraphs in Random Graphs

Elimelech, Dor, Huleihel, Wasim

arXiv.org Artificial Intelligence

The problems of detecting and recovering planted structures/subgraphs in Erd\H{o}s-R\'{e}nyi random graphs, have received significant attention over the past three decades, leading to many exciting results and mathematical techniques. However, prior work has largely focused on specific ad hoc planted structures and inferential settings, while a general theory has remained elusive. In this paper, we bridge this gap by investigating the detection of an \emph{arbitrary} planted subgraph $\Gamma = \Gamma_n$ in an Erd\H{o}s-R\'{e}nyi random graph $\mathcal{G}(n, q_n)$, where the edge probability within $\Gamma$ is $p_n$. We examine both the statistical and computational aspects of this problem and establish the following results. In the dense regime, where the edge probabilities $p_n$ and $q_n$ are fixed, we tightly characterize the information-theoretic and computational thresholds for detecting $\Gamma$, and provide conditions under which a computational-statistical gap arises. Most notably, these thresholds depend on $\Gamma$ only through its number of edges, maximum degree, and maximum subgraph density. Our lower and upper bounds are general and apply to any value of $p_n$ and $q_n$ as functions of $n$. Accordingly, we also analyze the sparse regime where $q_n = \Theta(n^{-\alpha})$ and $p_n-q_n =\Theta(q_n)$, with $\alpha\in[0,2]$, as well as the critical regime where $p_n=1-o(1)$ and $q_n = \Theta(n^{-\alpha})$, both of which have been widely studied, for specific choices of $\Gamma$. For these regimes, we show that our bounds are tight for all planted subgraphs investigated in the literature thus far\textemdash{}and many more. Finally, we identify conditions under which detection undergoes sharp phase transition, where the boundaries at which algorithms succeed or fail shift abruptly as a function of $q_n$.


Low coordinate degree algorithms I: Universality of computational thresholds for hypothesis testing

Kunisky, Dmitriy

arXiv.org Machine Learning

We study when low coordinate degree functions (LCDF) -- linear combinations of functions depending on small subsets of entries of a vector -- can hypothesis test between high-dimensional probability measures. These functions are a generalization, proposed in Hopkins' 2018 thesis but seldom studied since, of low degree polynomials (LDP), a class widely used in recent literature as a proxy for all efficient algorithms for tasks in statistics and optimization. Instead of the orthogonal polynomial decompositions used in LDP calculations, our analysis of LCDF is based on the Efron-Stein or ANOVA decomposition, making it much more broadly applicable. By way of illustration, we prove channel universality for the success of LCDF in testing for the presence of sufficiently "dilute" random signals through noisy channels: the efficacy of LCDF depends on the channel only through the scalar Fisher information for a class of channels including nearly arbitrary additive i.i.d. noise and nearly arbitrary exponential families. As applications, we extend lower bounds against LDP for spiked matrix and tensor models under additive Gaussian noise to lower bounds against LCDF under general noisy channels. We also give a simple and unified treatment of the effect of censoring models by erasing observations at random and of quantizing models by taking the sign of the observations. These results are the first computational lower bounds against any large class of algorithms for all of these models when the channel is not one of a few special cases, and thereby give the first substantial evidence for the universality of several statistical-to-computational gaps.


Detection of Correlated Random Vectors

Elimelech, Dor, Huleihel, Wasim

arXiv.org Artificial Intelligence

In this paper, we investigate the problem of deciding whether two standard normal random vectors $\mathsf{X}\in\mathbb{R}^{n}$ and $\mathsf{Y}\in\mathbb{R}^{n}$ are correlated or not. This is formulated as a hypothesis testing problem, where under the null hypothesis, these vectors are statistically independent, while under the alternative, $\mathsf{X}$ and a randomly and uniformly permuted version of $\mathsf{Y}$, are correlated with correlation $\rho$. We analyze the thresholds at which optimal testing is information-theoretically impossible and possible, as a function of $n$ and $\rho$. To derive our information-theoretic lower bounds, we develop a novel technique for evaluating the second moment of the likelihood ratio using an orthogonal polynomials expansion, which among other things, reveals a surprising connection to integer partition functions. We also study a multi-dimensional generalization of the above setting, where rather than two vectors we observe two databases/matrices, and furthermore allow for partial correlations between these two.


Testing Dependency of Unlabeled Databases

Paslev, Vered, Huleihel, Wasim

arXiv.org Artificial Intelligence

In this paper, we investigate the problem of deciding whether two random databases $\mathsf{X}\in\mathcal{X}^{n\times d}$ and $\mathsf{Y}\in\mathcal{Y}^{n\times d}$ are statistically dependent or not. This is formulated as a hypothesis testing problem, where under the null hypothesis, these two databases are statistically independent, while under the alternative, there exists an unknown row permutation $\sigma$, such that $\mathsf{X}$ and $\mathsf{Y}^\sigma$, a permuted version of $\mathsf{Y}$, are statistically dependent with some known joint distribution, but have the same marginal distributions as the null. We characterize the thresholds at which optimal testing is information-theoretically impossible and possible, as a function of $n$, $d$, and some spectral properties of the generative distributions of the datasets. For example, we prove that if a certain function of the eigenvalues of the likelihood function and $d$, is below a certain threshold, as $d\to\infty$, then weak detection (performing slightly better than random guessing) is statistically impossible, no matter what the value of $n$ is. This mimics the performance of an efficient test that thresholds a centered version of the log-likelihood function of the observed matrices. We also analyze the case where $d$ is fixed, for which we derive strong (vanishing error) and weak detection lower and upper bounds.


Precise Error Rates for Computationally Efficient Testing

Moitra, Ankur, Wein, Alexander S.

arXiv.org Machine Learning

We revisit the fundamental question of simple-versus-simple hypothesis testing with an eye towards computational complexity, as the statistically optimal likelihood ratio test is often computationally intractable in high-dimensional settings. In the classical spiked Wigner model (with a general i.i.d. spike prior) we show that an existing test based on linear spectral statistics achieves the best possible tradeoff curve between type I and type II error rates among all computationally efficient tests, even though there are exponential-time tests that do better. This result is conditional on an appropriate complexity-theoretic conjecture, namely a natural strengthening of the well-established low-degree conjecture. Our result shows that the spectrum is a sufficient statistic for computationally bounded tests (but not for all tests). To our knowledge, our approach gives the first tool for reasoning about the precise asymptotic testing error achievable with efficient computation. The main ingredients required for our hardness result are a sharp bound on the norm of the low-degree likelihood ratio along with (counterintuitively) a positive result on achievability of testing. This strategy appears to be new even in the setting of unbounded computation, in which case it gives an alternate way to analyze the fundamental statistical limits of testing.


Phase Transitions in the Detection of Correlated Databases

Elimelech, Dor, Huleihel, Wasim

arXiv.org Artificial Intelligence

We study the problem of detecting the correlation between two Gaussian databases $\mathsf{X}\in\mathbb{R}^{n\times d}$ and $\mathsf{Y}^{n\times d}$, each composed of $n$ users with $d$ features. This problem is relevant in the analysis of social media, computational biology, etc. We formulate this as a hypothesis testing problem: under the null hypothesis, these two databases are statistically independent. Under the alternative, however, there exists an unknown permutation $\sigma$ over the set of $n$ users (or, row permutation), such that $\mathsf{X}$ is $\rho$-correlated with $\mathsf{Y}^\sigma$, a permuted version of $\mathsf{Y}$. We determine sharp thresholds at which optimal testing exhibits a phase transition, depending on the asymptotic regime of $n$ and $d$. Specifically, we prove that if $\rho^2d\to0$, as $d\to\infty$, then weak detection (performing slightly better than random guessing) is statistically impossible, irrespectively of the value of $n$. This compliments the performance of a simple test that thresholds the sum all entries of $\mathsf{X}^T\mathsf{Y}$. Furthermore, when $d$ is fixed, we prove that strong detection (vanishing error probability) is impossible for any $\rho<\rho^\star$, where $\rho^\star$ is an explicit function of $d$, while weak detection is again impossible as long as $\rho^2d\to0$. These results close significant gaps in current recent related studies.


Planted Bipartite Graph Detection

Rotenberg, Asaf, Huleihel, Wasim, Shayevitz, Ofer

arXiv.org Artificial Intelligence

We consider the task of detecting a hidden bipartite subgraph in a given random graph. Specifically, under the null hypothesis, the graph is a realization of an Erd\H{o}s-R\'{e}nyi random graph over $n$ vertices with edge density $q$. Under the alternative, there exists a planted $k_{\mathsf{R}} \times k_{\mathsf{L}}$ bipartite subgraph with edge density $p>q$. We derive asymptotically tight upper and lower bounds for this detection problem in both the dense regime, where $q,p = \Theta\left(1\right)$, and the sparse regime where $q,p = \Theta\left(n^{-\alpha}\right), \alpha \in \left(0,2\right]$. Moreover, we consider a variant of the above problem, where one can only observe a relatively small part of the graph, by using at most $\mathsf{Q}$ edge queries. For this problem, we derive upper and lower bounds in both the dense and sparse regimes.